Sobolev Training for Neural Networks

Neural Information Processing Systems

At the heart of deep learning, we aim to use neural networks as function approximators, training them to produce outputs from inputs in emulation of a ground truth function or data generation process. In many cases we only have access to input-output pairs from the ground truth; however, it is becoming more common to also have access to derivatives of the target output with respect to the input, for example when the ground truth function is itself a neural network, as in network compression or distillation. Generally these target derivatives are not computed, or are ignored. This paper introduces Sobolev Training for neural networks, a method for incorporating these target derivatives, in addition to the target values, while training.
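The idea of fitting target derivatives alongside target values can be illustrated with a minimal sketch. Here a cubic polynomial student is fit to a teacher function by jointly minimizing value error and first-derivative error; the teacher f(x) = sin(x) and the polynomial student are illustrative choices for this sketch, not the paper's setup.

```python
import numpy as np

# Training inputs and teacher supervision: values and target derivatives.
x = np.linspace(-1.0, 1.0, 50)
y = np.sin(x)    # teacher values
dy = np.cos(x)   # teacher derivatives w.r.t. the input

# Student: p(x) = w0 + w1*x + w2*x^2 + w3*x^3.
# Design matrix rows for the values ...
A_val = np.stack([np.ones_like(x), x, x**2, x**3], axis=1)
# ... and for the derivatives p'(x) = w1 + 2*w2*x + 3*w3*x^2.
A_der = np.stack([np.zeros_like(x), np.ones_like(x), 2 * x, 3 * x**2], axis=1)

# Sobolev-style objective: minimize value error and derivative error jointly.
A = np.concatenate([A_val, A_der], axis=0)
b = np.concatenate([y, dy])
w, *_ = np.linalg.lstsq(A, b, rcond=None)

value_err = np.max(np.abs(A_val @ w - y))
deriv_err = np.max(np.abs(A_der @ w - dy))
```

Because the student here is linear in its parameters, the joint objective reduces to a single least-squares problem; with a neural network student, the same two error terms would instead be summed into one loss and minimized by gradient descent, with the student's input derivatives obtained via automatic differentiation.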


Supplementary information 1 Simulation parameters

Neural Information Processing Systems

All simulations were implemented in PyTorch [5]. For the nonlinear neuroscience tasks, we applied the gradient-descent method Adam [4] to the recurrent weights W as well as to the input and output vectors m_i, w_i. We checked that our results did not depend qualitatively on the choice of Adam over plain gradient descent; however, training converged more easily with Adam. We also checked that restricting training to W alone (as for the simple model) did not alter our results qualitatively (although, with this restriction, training on the Romo task did not converge for small values of g). Code for reproducing our results can be found at https://github.com/frschu/neurips_
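For reference, the Adam update rule (Kingma & Ba [4]) can be sketched in a few lines. The toy quadratic loss and hyperparameters below are illustrative assumptions, not the paper's settings; in the actual simulations the update would be applied by PyTorch's built-in optimizer to W, m_i, and w_i.

```python
import numpy as np

def loss_and_grad(theta):
    # Toy anisotropic quadratic: L(theta) = 0.5 * theta^T diag(1, 100) theta.
    scales = np.array([1.0, 100.0])
    return 0.5 * np.sum(scales * theta**2), scales * theta

def adam(theta, steps=2000, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    m = np.zeros_like(theta)  # first-moment (mean) estimate of the gradient
    v = np.zeros_like(theta)  # second-moment (uncentered variance) estimate
    for t in range(1, steps + 1):
        _, g = loss_and_grad(theta)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g**2
        m_hat = m / (1 - b1**t)   # bias correction for zero initialization
        v_hat = v / (1 - b2**t)
        theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta

theta_adam = adam(np.array([1.0, 1.0]))
final_loss, _ = loss_and_grad(theta_adam)
```

The per-coordinate normalization by sqrt(v_hat) makes the step size roughly uniform across directions of very different curvature, which is one reason such adaptive methods often converge more easily than plain gradient descent on poorly conditioned problems like this one.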